On Large-Scale Retrieval: Binary or n-ary Coding?

Authors

  • Mahyar Najibi
  • Mohammad Rastegari
  • Larry S. Davis
Abstract

The growing amount of data in modern datasets creates a need to search and retrieve information efficiently. Distance Estimation and Subset Indexing are the main approaches to making large-scale search feasible. Although binary coding has been popular for implementing both techniques, n-ary coding (known as Product Quantization) is also very effective for Distance Estimation. However, the relative performance of binary and n-ary coding has not been studied for Subset Indexing. We investigate whether binary or n-ary coding works better under different retrieval strategies. This leads to the design of a new n-ary coding method, "Linear Subspace Quantization (LSQ)", which, unlike other n-ary encoders, can be used as a similarity-preserving embedding. Experiments on image retrieval show that when Distance Estimation is used, n-ary LSQ outperforms other methods. However, when Subset Indexing is applied, interestingly, binary codings are more effective, and binary LSQ achieves the best accuracy.
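To make the two retrieval strategies named in the abstract concrete, here is a minimal sketch in Python. It is not the LSQ method: the binary codes come from random hyperplane signs, used purely as a stand-in encoder, and every function name is hypothetical. Distance Estimation is shown as a full scan ranked by Hamming distance; Subset Indexing is shown as a hash-table probe that inspects only buckets within Hamming radius 0 or 1 of the query code.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, bits, seed=0):
    """Random Gaussian hyperplanes (an assumed stand-in encoder, not LSQ)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def binary_code(x, planes):
    """Pack the sign of each projection into one integer code."""
    code = 0
    for w in planes:
        bit = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def rank_by_distance(query_code, db_codes, k):
    """Distance Estimation: scan every code and keep the k closest."""
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    return order[:k]


def build_index(db_codes):
    """Subset Indexing, build step: bucket item ids by their exact code."""
    table = defaultdict(list)
    for i, c in enumerate(db_codes):
        table[c].append(i)
    return table


def probe_index(query_code, table, bits, radius=1):
    """Subset Indexing, query step: visit buckets at Hamming radius 0 or 1."""
    hits = list(table.get(query_code, []))
    if radius >= 1:
        for b in range(bits):
            hits.extend(table.get(query_code ^ (1 << b), []))
    return hits
```

The scan touches every database code, while the probe touches at most `bits + 1` buckets, which is the trade-off between the two strategies that the abstract compares.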


Similar Resources

A Fresh Look at Coding for q-ary Symmetric Channels

This paper studies coding schemes for the q-ary symmetric channel based on binary low-density parity-check (LDPC) codes that work for any alphabet size q = 2^m, m ∈ N, thus complementing some recently proposed packet-based schemes requiring large q. First, theoretical optimality of a simple layered scheme is shown, then a practical coding scheme based on a simple modification of standard binary L...


The fuzzy set model based on N-ary positively compensatory operators

We have enhanced the fuzzy set model by replacing MIN and MAX operators with binary positively compensatory operators. Though the binary operators provide higher retrieval effectiveness, they can give different document values for logically equivalent queries, e.g. t1 AND (t2 AND t3) and (t1 AND t2) AND t3. This is because they do not satisfy the basic Boolean processing laws such as distributive...


Enumeration of sequences with large alphabets

A binary sequence of length n with w ones can be identified by its lexicographical rank in the set of all binary sequences with same number of ones and zeros, which is of size n!/(w!·(n−w)!). Although that enumeration has been deeply studied for binary case, it is less addressed for σ-ary sequences, where σ > 2. Assuming n is a fixed predetermined parameter, the enumerative coding of a given n-s...


Deep Hashing Network for Efficient Similarity Retrieval

Due to the storage and retrieval efficiency, hashing has been widely deployed to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing, which improves the quality of hash coding by exploiting the semantic similarity on data pairs, has received increasing attention recently. For most existing supervised hashing methods for image retrieval, an image is first...


Modified Gray-Level Coding Method for Absolute Phase Retrieval

Fringe projection systems have been widely applied in three-dimensional (3D) shape measurements. One of the important issues is how to retrieve the absolute phase. This paper presents a modified gray-level coding method for absolute phase retrieval. Specifically, two groups of fringe patterns are projected onto the measured objects, including three phase-shift patterns for the wrapped phase, an...



Journal:
  • CoRR

Volume abs/1509.06066

Publication date: 2015